Inverse Reinforcement Learning through Structured Classification

Authors

Edouard Klein

Matthieu Geist

Bilal Piot

Olivier Pietquin

Abstract

This paper addresses the inverse reinforcement learning (IRL) problem, that is, inferring a reward for which a demonstrated expert behavior is optimal. We introduce a new algorithm, SCIRL, whose principle is to use the so-called feature expectation of the expert as the parameterization of the score function of a multiclass classifier. This approach produces a reward function for which the expert policy is provably near-optimal. Contrary to most existing IRL algorithms, SCIRL does not require solving the direct RL problem. Moreover, with an appropriate heuristic, it can succeed with only trajectories sampled according to the expert behavior. This is illustrated on a car driving simulator.
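The principle stated in the abstract can be illustrated with a minimal sketch. It assumes a reward that is linear in state features, R(s) = theta . phi(s), so that the same theta scores a state-action pair through the expert feature expectation, theta . mu(s, a). The margin-based multiclass classifier trained by subgradient descent is one possible choice of classifier; the function names, the learning rate, and the way `mu_all` is filled in for non-expert actions are assumptions made for illustration and are not taken from this page.

```python
import numpy as np

def discounted_feature_expectations(features, gamma):
    """Monte Carlo estimate of the feature expectation along one expert trajectory.

    features[t] is phi(s_t); the result mu[t] is sum_{k>=t} gamma**(k-t) * phi(s_k).
    """
    features = np.asarray(features, dtype=float)
    mu = np.zeros_like(features)
    running = np.zeros(features.shape[1])
    # Accumulate backwards so each step reuses the discounted tail.
    for t in range(len(features) - 1, -1, -1):
        running = features[t] + gamma * running
        mu[t] = running
    return mu

def fit_theta(mu_all, expert_actions, lr=0.1, n_iters=50):
    """Fit theta so that expert actions get the highest score theta . mu(s, a).

    mu_all has shape (n_samples, n_actions, d). How the rows for non-expert
    actions are obtained is left open here (assumption: they are given).
    """
    mu_all = np.asarray(mu_all, dtype=float)
    expert_actions = np.asarray(expert_actions)
    n, n_actions, d = mu_all.shape
    idx = np.arange(n)
    # Margin of 1 for every non-expert action, 0 for the expert action.
    margin = np.ones((n, n_actions))
    margin[idx, expert_actions] = 0.0
    theta = np.zeros(d)
    for _ in range(n_iters):
        augmented = mu_all @ theta + margin          # shape (n, n_actions)
        worst = augmented.argmax(axis=1)             # most violating action
        subgrad = (mu_all[idx, worst] - mu_all[idx, expert_actions]).mean(axis=0)
        theta -= lr * subgrad
    return theta

def reward(theta, phi_s):
    """Reward read off the classifier parameters: R(s) = theta . phi(s)."""
    return np.asarray(phi_s, dtype=float) @ theta
```

Note that nothing in this sketch solves a forward RL problem: theta comes from a classification step only, and the reward is then read directly from theta.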

Related resources

From Structured Prediction to Inverse Reinforcement Learning

Machine learning is all about making predictions; language is full of complex rich structure. Structured prediction marries these two. However, structured prediction isn’t always enough: sometimes the world throws even more complex data at us, and we need reinforcement learning techniques. This tutorial is all about the how and the why of structured prediction and inverse reinforcement learning...

Beyond Structured Prediction: Inverse Reinforcement Learning

Structured Classification for Inverse Reinforcement Learning

This paper addresses the Inverse Reinforcement Learning (IRL) problem, which is a particular case of learning from demonstrations. The IRL framework assumes that an expert, demonstrating a task, is acting optimally with respect to an unknown reward function to be discovered. Unlike most existing IRL algorithms, the proposed approach doesn’t require any of the following: complete trajectories ...

On Correcting Inputs: Inverse Optimization for Online Structured Prediction

Algorithm designers typically assume that the input data is correct, and then proceed to find “optimal” or “sub-optimal” solutions using this input data. However this assumption of correct data does not always hold in practice, especially in the context of online learning systems where the objective is to learn appropriate feature weights given some training samples. Such scenarios necessitate ...

Around Inverse Reinforcement Learning and Score-based Classification

Inverse reinforcement learning (IRL) aims at estimating an unknown reward function optimized by some expert agent from interactions between this expert and the system to be controlled. One of its major application fields is imitation learning, where the goal is to imitate the expert, possibly in situations not encountered before. A classic and simple way to handle this problem is to see it as a...

Publication date: 2012
